Labeling Document Clusters with Thematic Phrases
Similar resources
Automatic Labeling of Document Clusters
Automatically labeling document clusters with words which indicate their topics is difficult to do well. The most commonly used method, labeling with the most frequent words in the clusters, ends up using many words that are virtually void of descriptive power even after traditional stop words are removed. Another method, labeling with the most predictive words, often includes rather obscure wo...
Bayesian Unsupervised Labeling of Web Document Clusters
Information technologies have recently led to a surge of electronic documents in the form of emails, webpages, blogs, news articles, etc. To help users decide which documents may be interesting to read, it is common practice to organize documents by categories/topics. A wide range of supervised and unsupervised learning techniques already exist for automated text classification and text cluster...
Text Document Topical Recursive Clustering and Automatic Labeling of a Hierarchy of Document Clusters
The overwhelming amount of textual documents available nowadays highlights the need for information organization and discovery. Effectively organizing documents into a hierarchy of topics and subtopics makes it easier for users to browse the documents. This paper borrows community mining from social network analysis to generate a hierarchy of topically coherent document clusters. It focuses on ...
Introducing thematic clusters of articles
The European Journal of Psychotraumatology (EJPT) is becoming an important repository of knowledge for the field of psychotraumatology. Since its launch in December 2010 (see Olff, 2010; Olff & Bindslev, 2011), new ideas and initiatives have developed to further increase visibility and easy access to papers published in the Journal. Most scientific journals include in their editorial strategies t...
Automated Document Labeling
An increasing number of publishers are using the Internet and the World Wide Web to provide their subscribers with access to online journals. New techniques are needed to capture, classify, analyze, extract, modify, and reformat Web-based document information for computer storage, access, and processing. An R&D division of the National Library of Medicine (NLM) is developing an automated system...
Journal
Journal title: IARJSET
Year: 2017
ISSN: 2393-8021
DOI: 10.17148/iarjset.2017.4703